English-Hungarian NP Alignment in MetaMorpho TM

Author

  • Gábor Pohl
Abstract

In this paper, a fast automatic NP alignment technique developed for MetaMorpho TM is presented. MetaMorpho TM is an EBMT-based translation memory that stores not only full sentence pairs but also NP pairs in its database of translations. To meet the speed requirements of a translation memory (segments have to be stored quickly), the proposed NP alignment algorithm replaces time-consuming statistical data collection with stemmed lexical matching using a bilingual dictionary, cognate matching, and POS matching. A simple heuristic for extracting Hungarian NP candidates without a deep parser is also presented. Parsed NPs of an English sentence are mapped to the words of the Hungarian translation, and the shortest span containing all matched words is expanded to a full Hungarian NP using simple rules. A first experiment shows that high precision can be reached using the two algorithms discussed.
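
The matching and span-selection steps described in the abstract could be sketched roughly as follows. This is an illustrative reconstruction, not the paper's implementation: `toy_stem`, `is_cognate`, `align_np`, the similarity threshold, the tiny dictionary, and the article-based expansion rule are all assumptions made for the example, and POS matching is left out entirely.

```python
from difflib import SequenceMatcher

# Assumption: a single toy expansion rule (pull in a preceding article).
HU_ARTICLES = {"a", "az", "egy"}


def toy_stem(word):
    """Hypothetical stand-in for a real Hungarian stemmer."""
    word = word.lower()
    for suffix in ("ban", "ben", "nak", "nek", "t"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def is_cognate(a, b, threshold=0.8, min_len=4):
    """Treat two words as cognates if their surface forms are similar enough.

    The threshold and minimum length are illustrative values only.
    """
    if min(len(a), len(b)) < min_len:
        return False
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def matched_positions(en_np_words, hu_tokens, dictionary):
    """Indices of Hungarian tokens matched by any word of the English NP."""
    positions = []
    for i, token in enumerate(hu_tokens):
        hu_stem = toy_stem(token)
        for en in en_np_words:
            in_dictionary = hu_stem in dictionary.get(en.lower(), set())
            if in_dictionary or is_cognate(en, hu_stem):
                positions.append(i)
                break
    return positions


def align_np(en_np_words, hu_tokens, dictionary):
    """Return (start, end) token indices of the Hungarian NP candidate, or None."""
    positions = matched_positions(en_np_words, hu_tokens, dictionary)
    if not positions:
        return None
    # Shortest span containing all matched words.
    start, end = min(positions), max(positions)
    # Toy expansion rule: include an article directly before the span.
    if start > 0 and hu_tokens[start - 1].lower() in HU_ARTICLES:
        start -= 1
    return start, end


dictionary = {"red": {"piros"}, "car": {"autó", "kocsi"}}
hu = "Tegnap láttam a piros autót az utcán".split()
span = align_np(["the", "red", "car"], hu, dictionary)
print(" ".join(hu[span[0]: span[1] + 1]))  # prints "a piros autót"
```

Because every step is a dictionary lookup or a string comparison over a single sentence pair, no corpus-wide statistics need to be collected before a segment can be stored, which is the speed argument the abstract makes.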

Similar resources

MorphoLogic's Submission for the WMT 2009 Shared Task

In this article, we describe the machine translation systems we used to create MorphoLogic’s submissions to the WMT09 shared Hungarian to English and English to Hungarian translation tasks. We used our rule-based MetaMorpho system to generate our primary submission. In addition, we created a hybrid system where the Moses decoder is used to rank translations or assemble partial translatio...

MetaMorpho: A Pattern-Based Machine Translation System

This paper describes an efficient real-time comprehension assistance and machine translation method. Combining the advantages of example-based (EBMT) and rule-based machine translation (RBMT), a new paradigm, pattern-based translation, is presented. A system based on these principles that features an innovative user-friendly interface has been built. Called MetaMorpho, the system has been tested...

The MetaMorpho Translation System

In this article, we present MetaMorpho, a rule-based machine translation system that was used to create MorphoLogic’s submission to the WMT08 shared Hungarian to English translation task. The architecture of MetaMorpho does not fit easily into traditional categories of rule-based systems: the building blocks of its grammar are pairs of rules that describe source and target language structures i...

NP Alignment in Bilingual Corpora

We created a simple gold standard for English-Hungarian NP-level alignment based on Orwell’s 1984 (since this already exists in manually verified POS-tagged format in many languages thanks to the Multext and Multext-East projects) by manually verifying the automatically generated NP chunking (we used the yamcha, mallet and hunchunk taggers) and manually aligning the maximal NPs and PPs. The maximum NP chun...

Sentence Alignment of Hungarian-English Parallel Corpora Using a Hybrid Algorithm

We present an efficient hybrid method for aligning sentences with their translations in a parallel bilingual corpus. The new algorithm is composed of a length-based and anchor matching method that uses Named Entity recognition. This algorithm combines the speed of length-based models with the accuracy of anchor finding methods. The accuracy of finding cognates for Hungarian-English language pair is e...

Publication date: 2006